Tree Structured Data Analysis

Author

  • Leland Wilkinson
Abstract

Classification and regression trees are becoming increasingly popular for partitioning data and identifying local structure in small and large datasets. Classification trees include those models in which the dependent variable (the predicted variable) is categorical. Regression trees include those in which it is continuous. This paper discusses pitfalls in the use of these methods and highlights where they are especially suitable. Paper presented at the 1992 Sun Valley, ID, Sawtooth/SYSTAT Joint Software Conference.


Similar resources

Estimating Tree-Structured Covariance Matrices via Mixed-Integer Programming with an Application to Phylogenetic Analysis of Gene Expression

We present a novel method for estimating tree-structured covariance matrices directly from observed continuous data. A representation of these classes of matrices as linear combinations of rank-one matrices indicating object partitions is used to formulate estimation as instances of well-studied numerical optimization problems. In particular, we present estimation based on projection where the ...

An Introduction to Functional Data Analysis of Populations of Tree-structured Objects

This paper proposes a new method for understanding the structure of populations of complex objects in the area of medical image analysis. The new methods require invention of approaches to the statistical analysis of a population of tree-structured objects. The approach is based on a metric in tree space. The metric provides a foundation for defining a notion of population center. In Functional...

A Polynomial Time Matching Algorithm of Structured Ordered Tree Patterns for Data Mining from Semistructured Data

Tree structured data such as HTML/XML files are represented by rooted trees with ordered children and edge labels. Knowledge representations for tree structured data are quite important to discover interesting features which such tree structured data have. In this paper, as a representation of structural features we propose a structured ordered tree pattern, called a term tree, which is a roote...

Hierarchical Pairwise Data

Partitioning a data set and extracting hidden structure arises in different application areas of pattern recognition, data analysis and image processing. We formulate data clustering for data characterized by pairwise dissimilarity values as an assignment problem with an objective function to be minimized. An extension to tree-structured clustering is proposed which allows a hierarchical groupin...

STEED: An Analytical Database System for TrEE-structured Data

Tree-structured data formats, such as JSON and Protocol Buffers, are capable of expressing sophisticated data types, including nested, repeated, and missing values. While such expressing power contributes to their popularity in real-world applications, it presents a significant challenge for systems supporting tree-structured data. Existing systems have focused on general-purpose solutions eith...


Publication date: 2000